
    Replication study: Development and validation of deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs

    Replication studies are essential for validating new methods, maintaining the high standards of scientific publication, and putting results into practice. We have attempted to replicate the main method in 'Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs', published in JAMA 2016; 316(22). We re-implemented the method, since the source code is not available, and used publicly available data sets. The original study trained on non-public fundus images from EyePACS and three hospitals in India; we used a different EyePACS data set from Kaggle. The original study evaluated the algorithm on the benchmark data set Messidor-2; we used the same data set. In the original study, ophthalmologists re-graded all images for diabetic retinopathy, macular edema, and image gradability. Our data sets provide a single diabetic retinopathy grade per image, and we assessed image gradability ourselves. Hyper-parameter settings were not described in the original study, although some were published later. We were not able to replicate the original study: our algorithm's area under the receiver operating characteristic curve (AUC) of 0.94 on the Kaggle EyePACS test set and 0.80 on Messidor-2 did not come close to the reported AUC of 0.99. This may be caused by the use of a single grade per image, different data, or different, undescribed hyper-parameter settings. This study shows the challenges of replicating deep learning methods and the need for more replication studies to validate them, especially in medical image analysis. Our source code and instructions are available at: https://github.com/mikevoets/jama16-retina-replication. Comment: The third version of this paper includes results from replication after certain hyper-parameters were published in a later article. 16 pages, 6 figures, 1 table, presented at NOBIM 201
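    The AUC metric compared above can be illustrated with a small sketch: a pure-Python area under the ROC curve computed via the Mann-Whitney statistic. The labels and scores below are made-up toy values, not data from either study.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a randomly chosen positive example scores
    higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy referable-DR example: 1 = referable, 0 = non-referable (made up).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(auc(labels, scores))  # 8 of 9 positive/negative pairs ranked correctly
```

    In practice one would compute this with a library routine over thousands of graded images; the pairwise formulation here only makes the definition explicit.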

    MORTAL - Multiparadigm Optimizing Retargetable Transdisciplinary Abstraction Language

    This short paper describes MORTAL, a new general-purpose programming language and compiler for high-performance scientific applications. MORTAL aims to bridge the knowledge gap between computer scientists and domain scientists by offering a multiparadigm programming environment that connects the mathematical formulae written by the scientist to the algorithms implemented by the software engineer in a natural way, understood by both. We provide the rationale for MORTAL and give an overview of the language design and the MORTAL compiler. The compiler is self-hosting, and our initial evaluation shows that MORTAL programs have performance similar to that of C programs.

    nsroot: Minimalist Process Isolation Tool Implemented With Linux Namespaces

    Data analyses in the life sciences are moving from tools run on a personal computer to services run on large computing platforms. This creates a need to package tools and dependencies for easy installation, configuration and deployment on distributed platforms. In addition, for secure execution there is a need for process isolation on a shared platform. Existing virtual machine and container technologies are often more complex than traditional Unix utilities like chroot, and often require root privileges to set up or use. This is especially challenging on HPC systems, where users typically do not have root access. We therefore present nsroot, a lightweight process isolation tool based on Linux namespaces. It allows restricting the runtime environment of data analysis tools that may not have been designed with security as a top priority, reducing the risk and consequences of security breaches without requiring any special privileges. The codebase of nsroot is small, and it provides a command line interface similar to chroot's. It can be used on all Linux kernels that implement user namespaces. In addition, we propose combining nsroot with the AppImage format for secure execution of packaged applications. nsroot is open sourced and available at: https://github.com/uit-no/nsroot
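    The kernel primitive such a tool builds on can be sketched with a direct `unshare(2)` call. This is an illustrative, Linux-only use of the syscall via ctypes, not nsroot's actual implementation, and it degrades gracefully where unprivileged user namespaces are disabled.

```python
import ctypes
import os

# Flag values from <linux/sched.h>. CLONE_NEWUSER needs no privileges on
# kernels that allow unprivileged user namespaces; CLONE_NEWNS (a mount
# namespace) is root-only unless combined with CLONE_NEWUSER.
CLONE_NEWUSER = 0x10000000
CLONE_NEWNS = 0x00020000

libc = ctypes.CDLL(None, use_errno=True)

def try_unshare(flags):
    """Attempt unshare(2); return 'ok' or the errno message on failure."""
    if libc.unshare(flags) != 0:
        return os.strerror(ctypes.get_errno())
    return "ok"

# New user namespace plus a mount namespace inside it: the combination
# that lets an unprivileged process restrict its own filesystem view.
print(try_unshare(CLONE_NEWUSER | CLONE_NEWNS))
```

    The real tool layers a chroot-like command line interface on top of this, but the unprivileged-isolation property comes entirely from the user namespace.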

    The Beauty of Complex Designs

    The increasing use of omics data in epidemiology enables many novel study designs, but also introduces challenges for data analysis. We describe the possibilities for systems epidemiological designs in the Norwegian Women and Cancer (NOWAC) study and show how the complexity of NOWAC enables many beautiful new study designs. We discuss the challenges of implementing designs and analyzing data. Finally, we propose a systems architecture for swift design and exploration of epidemiological studies.

    Cancer detection for white urban Americans

    Poster presentation at the NORA Annual Conference 2023, 05.06-06.06.23, Tromsø, Norway. Development, validation, and comparison of machine learning methods require access to data, sometimes lots of data. Within health applications, data sharing can be restricted due to patient privacy, and the few publicly available data sets become even more valuable for the machine learning community. One such type of data is H&E whole slide images (WSI), images of stained tumour tissue used in hospitals to detect and classify cancer, see Fig. 1. The Cancer Genome Atlas (TCGA) has made an enormous contribution to publicly available data sets. For breast cancer H&E WSI it is by far the largest data set, with more than 1,000 patients, twice as many as the second largest contributor, the two Camelyon competition data sets [1] with 399 + 200 patients.

    GeneNet VR: Interactive visualization of large-scale biological networks using a standalone headset

    Visualizations are an essential part of interpreting biomedical analysis results. Often, interactive networks are used to visualize and interpret the data. However, the high interconnectivity and high dimensionality of the data often result in information overload, making the results hard to interpret. To address the information overload problem, existing solutions typically use data reduction, reduced interactivity, or expensive hardware. We propose using the affordable Oculus Quest Virtual Reality (VR) standalone headset for interactive visualization of large-scale biological networks. We present the design and implementation of our solution, GeneNet VR, and evaluate its scalability and usability using large gene-to-gene interaction networks. We achieve the 72 FPS required by the Oculus performance guidelines for the largest of our networks (2693 nodes), both with a GPU and on the Oculus Quest standalone. We found from our interviews with biomedical researchers that GeneNet VR is innovative, interesting, and easy to use for novice VR users. We believe affordable hardware like the Oculus Quest has big potential for biological data analysis. However, additional work is required to evaluate its benefits for knowledge discovery in real data analysis use cases. GeneNet VR is open-sourced: https://github.com/kolibrid/GeneNet-VR. A video demonstrating GeneNet VR used to explore large biological networks: https://youtu.be/N4QDZiZqVNY
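    Rendering aside, much of the per-frame cost of visualizing such networks comes from computing node positions. A minimal sketch of the kind of force-directed layout step commonly used for interaction networks follows (plain Python for illustration; GeneNet VR's actual Unity/VR code is not shown here, and the parameter values are arbitrary):

```python
import math
import random

def layout_step(pos, edges, repulsion=0.01, spring=0.1, rest=1.0):
    """One Fruchterman-Reingold-style layout iteration in 2D.
    pos: dict node -> [x, y]; edges: list of (u, v) pairs."""
    disp = {n: [0.0, 0.0] for n in pos}
    nodes = list(pos)
    for i, u in enumerate(nodes):          # pairwise repulsion, O(n^2)
        for v in nodes[i + 1:]:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = repulsion / (d * d)
            disp[u][0] += f * dx / d; disp[u][1] += f * dy / d
            disp[v][0] -= f * dx / d; disp[v][1] -= f * dy / d
    for u, v in edges:                     # spring attraction along edges
        dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
        d = math.hypot(dx, dy) or 1e-9
        f = spring * (d - rest)
        disp[u][0] += f * dx / d; disp[u][1] += f * dy / d
        disp[v][0] -= f * dx / d; disp[v][1] -= f * dy / d
    for n in pos:
        pos[n][0] += disp[n][0]; pos[n][1] += disp[n][1]
    return pos

random.seed(0)
pos = {n: [random.random(), random.random()] for n in range(5)}
for _ in range(50):
    layout_step(pos, [(0, 1), (1, 2), (2, 3), (3, 4)])
```

    The quadratic repulsion loop is exactly what stresses a standalone headset at thousands of nodes, which is why hitting 72 FPS on a 2693-node network is a meaningful result.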

    Evaluating the performance of the allreduce collective operation on clusters. Approach and results

    The performance of the collective operations provided by a communication library is important for many applications run on clusters. The communication structure of collective operations can be organized as a tree, and performance can be improved by configuring and mapping the tree to the clusters in use. We describe and demonstrate an approach for evaluating the performance of different configurations and mappings of allreduce run on clusters of different sizes, consisting of single-CPU hosts and SMPs with varying numbers of CPUs. A breakdown of the cost of allreduce using the best configuration on different clusters is provided. For all clusters, the broadcast part is more expensive than the reduce part. Inter-host communication contributes more to the time per allreduce than the synchronization in the allreduce components. For the small message sizes used (4 and 256 bytes), the time spent computing the partial reductions is insignificant. Reconfiguring hierarchy-aware trees improved performance by up to a factor of 1.49, by avoiding scalability problems of the components on SMPs and by finding the right balance between available concurrency, load on 'root' hosts, and the number of network links in a tree. Extending a tree by adding more threads, or by combining two trees, does not have a negative influence on the performance of a configuration, but increasing the message size does.
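    The tree-structured allreduce described above, a reduce phase up toward the root followed by a broadcast phase back down, can be sketched as a simulation over an explicit tree (this models the communication structure only, not any particular library's implementation):

```python
def allreduce_tree(tree, values, op=lambda a, b: a + b):
    """Simulate allreduce over a tree given as {node: [children]}.
    Phase 1 reduces partial results up toward the root; phase 2
    broadcasts the final value back down. Returns {node: result}."""
    # The root is the one node that appears as nobody's child.
    root = (set(tree) - {c for cs in tree.values() for c in cs}).pop()

    def reduce_up(node):
        acc = values[node]
        for child in tree.get(node, []):
            acc = op(acc, reduce_up(child))
        return acc

    total = reduce_up(root)               # reduce phase
    return {node: total for node in values}  # broadcast phase

# A hierarchy-aware shape: node 0 and 1 as host roots, leaves below them.
tree = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
values = {n: n + 1 for n in range(5)}  # 1..5, sum = 15
print(allreduce_tree(tree, values))
```

    Choosing the tree's fan-out and which hosts hold interior nodes is precisely the configuration/mapping decision that the evaluation above shows can change performance by up to 1.49x.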

    Work Extraction and Landauer's Principle in a Quantum Spin Hall Device

    Landauer's principle states that erasing each bit of information in a system requires at least $k_B T \ln 2$ of energy to be dissipated. In return, the blank bit may be utilized to extract usable work of the amount $k_B T \ln 2$, in keeping with the second law of thermodynamics. While in principle any collection of spins can serve as information storage, extracting work from this resource requires specialized engines capable of using it. In this work, we focus on heat and charge transport in a quantum spin Hall device in the presence of a spin bath. We show how a properly initialized nuclear spin subsystem can be used as a memory resource for a Maxwell's Demon to harvest available heat energy from the reservoirs and induce a charge current that can power an external electrical load. We also show how to initialize the nuclear spin subsystem using applied bias currents, which necessarily dissipate energy, hence demonstrating Landauer's principle. This provides an alternative method of "energy storage" in an all-electrical device. We finally propose a realistic setup to experimentally observe a Landauer erasure/work extraction cycle. Comment: Accepted for publication in PRB; 9 pages, 4 figures, RevTeX
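    For scale, the Landauer bound $k_B T \ln 2$ per bit at room temperature works out as follows (standard constants; illustrative arithmetic, not a result from the paper):

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K (exact SI value)
T = 300.0           # room temperature, K

# Minimum dissipation per erased bit, equivalently the maximum work
# extractable per blank bit.
E = k_B * T * math.log(2)
print(f"{E:.3e} J")  # on the order of 3e-21 J per bit
```

    The tiny magnitude is why observing an erasure/work-extraction cycle requires a carefully engineered mesoscopic device rather than a macroscopic one.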